智能论文笔记

Review on Social Behavior Analysis of Laboratory Animals: From Methodologies to Applications

Ziping Jiang , Paul L. Chazot , Richard Jiang

分类：计算机视觉 | 机器学习

2022-06-25

作为遗传和生理方面之间的桥梁，动物行为分析是生物学和生态学研究中最重要的主题之一。但是，识别，跟踪和记录动物行为是需要专业知识的劳动密集型作品。为了减轻注释数据的支出，研究人员转向用于自动标签算法的计算机视觉技术，因为大多数数据都是视觉记录的。在这项工作中，我们探讨了各种行为检测算法，涵盖了传统的视觉方法，统计方法和深度学习方法。这项工作的目的是对相关工作进行彻底的研究，为生物学家提供有效的动物行为检测方法。除此之外，我们还讨论了这些算法的优势和缺点，以为已经深入研究该领域的人们提供一些见解。

translated by 谷歌翻译

Machine Learning-based Biological Ageing Estimation Technologies: A Survey

Zhaonian Zhang , Richard Jiang , Danny Crookes , Paul Chazot

分类：计算机视觉 | 人工智能 | 机器学习

2022-06-25

近年来，已经开发了各种估计生物年龄（BA）的方法。尤其是随着机器学习（ML）的发展，BA的预测有越来越多的类型，并且准确性得到了极大的提高。估计BA的模型在监测健康衰老方面起着重要作用，并可以提供新的工具来检测普通人群的健康状况并向较不健康的人发出警告。我们将主要使用ML回顾三种年龄预测方法。它们基于血液生物标志物，面部图像和结构神经影像学特征。目前，使用血液生物标志物的模型是最简单，最直接，最准确的方法。面部图像方法受种族，环境等各个方面的影响，预测准确性不是很好，这不能为医疗领域做出巨大贡献。总而言之，我们在这里为我们和其他潜在的一般人群的大数据时代跟踪前进的方向，并展示了利用当今可用的大量数据的方式。

translated by 谷歌翻译

Mapping Knowledge Representations to Concepts: A Review and New Perspectives

Lars Holmberg , Paul Davidsson , Per Linde

分类：人工智能 | 机器学习

2022-12-31

The success of neural networks builds to a large extent on their ability to create internal knowledge representations from real-world high-dimensional data, such as images, sound, or text. Approaches to extract and present these representations, in order to explain the neural network's decisions, is an active and multifaceted research field. To gain a deeper understanding of a central aspect of this field, we have performed a targeted review focusing on research that aims to associate internal representations with human understandable concepts. In doing this, we added a perspective on the existing research by using primarily deductive nomological explanations as a proposed taxonomy. We find this taxonomy and theories of causality, useful for understanding what can be expected, and not expected, from neural network explanations. The analysis additionally uncovers an ambiguity in the reviewed literature related to the goal of model explainability; is it understanding the ML model or, is it actionable explanations useful in the deployment domain?

translated by 谷歌翻译

On Implicit Bias in Overparameterized Bilevel Optimization

Paul Vicol , Jonathan Lorraine , Fabian Pedregosa , David Duvenaud , Roger Grosse

分类：机器学习

2022-12-28

Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.

translated by 谷歌翻译

Brain Cancer Segmentation Using YOLOv5 Deep Neural Network

Sudipto Paul , Dr. Md Taimur Ahad , Md. Mahedi Hasan

分类：计算机视觉

2022-12-27

An expansion of aberrant brain cells is referred to as a brain tumor. The brain's architecture is extremely intricate, with several regions controlling various nervous system processes. Any portion of the brain or skull can develop a brain tumor, including the brain's protective coating, the base of the skull, the brainstem, the sinuses, the nasal cavity, and many other places. Over the past ten years, numerous developments in the field of computer-aided brain tumor diagnosis have been made. Recently, instance segmentation has attracted a lot of interest in numerous computer vision applications. It seeks to assign various IDs to various scene objects, even if they are members of the same class. Typically, a two-stage pipeline is used to perform instance segmentation. This study shows brain cancer segmentation using YOLOv5. Yolo takes dataset as picture format and corresponding text file. You Only Look Once (YOLO) is a viral and widely used algorithm. YOLO is famous for its object recognition properties. You Only Look Once (YOLO) is a popular algorithm that has gone viral. YOLO is well known for its ability to identify objects. YOLO V2, V3, V4, and V5 are some of the YOLO latest versions that experts have published in recent years. Early brain tumor detection is one of the most important jobs that neurologists and radiologists have. However, it can be difficult and error-prone to manually identify and segment brain tumors from Magnetic Resonance Imaging (MRI) data. For making an early diagnosis of the condition, an automated brain tumor detection system is necessary. The model of the research paper has three classes. They are respectively Meningioma, Pituitary, Glioma. The results show that, our model achieves competitive accuracy, in terms of runtime usage of M2 10 core GPU.

translated by 谷歌翻译

Large Language Models Encode Clinical Knowledge

Karan Singhal , Shekoofeh Azizi , Tao Tu , S. Sara Mahdavi , Jason Wei , Hyung Won Chung , Nathan Scales , Ajay Tanwani , Heather Cole-Lewis , Stephen Pfohl

分类：自然语言处理

2022-12-26

Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLM models for clinical applications.

translated by 谷歌翻译

Generalizable Natural Language Processing Framework for Migraine Reporting from Social Media

Yuting Guo , Swati Rajwal , Sahithi Lakamana , Chia-Chun Chiang , Paul C. Menell , Adnan H. Shahid , Yi-Chieh Chen , Nikita Chhabra , Wan-Ju Chao , Chieh-Ju Chao

分类：自然语言处理

2022-12-23

Migraine is a high-prevalence and disabling neurological disorder. However, information migraine management in real-world settings could be limited to traditional health information sources. In this paper, we (i) verify that there is substantial migraine-related chatter available on social media (Twitter and Reddit), self-reported by migraine sufferers; (ii) develop a platform-independent text classification system for automatically detecting self-reported migraine-related posts, and (iii) conduct analyses of the self-reported posts to assess the utility of social media for studying this problem. We manually annotated 5750 Twitter posts and 302 Reddit posts. Our system achieved an F1 score of 0.90 on Twitter and 0.93 on Reddit. Analysis of information posted by our 'migraine cohort' revealed the presence of a plethora of relevant information about migraine therapies and patient sentiments associated with them. Our study forms the foundation for conducting an in-depth analysis of migraine-related information using social media data.

translated by 谷歌翻译

Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models

Najoung Kim , Tal Linzen , Paul Smolensky

分类：自然语言处理

2022-12-21

Human linguistic capacity is often characterized by compositionality and the generalization it enables -- human learners can produce and comprehend novel complex expressions by composing known parts. Several benchmarks exploit distributional control across training and test to gauge compositional generalization, where certain lexical items only occur in limited contexts during training. While recent work using these benchmarks suggests that pretrained models achieve impressive generalization performance, we argue that exposure to pretraining data may break the aforementioned distributional control. Using the COGS benchmark of Kim and Linzen (2020), we test two modified evaluation setups that control for this issue: (1) substituting context-controlled lexical items with novel character sequences, and (2) substituting them with special tokens represented by novel embeddings. We find that both of these setups lead to lower generalization performance in T5 (Raffel et al., 2020), suggesting that previously reported results have been overestimated due to uncontrolled lexical exposure during pretraining. The performance degradation is more extreme with novel embeddings, and the degradation increases with the amount of pretraining data, highlighting an interesting case of inverse scaling.

translated by 谷歌翻译

evoML Yellow Paper: Evolutionary AI and Optimisation Studio

Lingbo Li , Leslie Kanthan , Michail Basios , Fan Wu , Manal Adham , Vitali Avagyan , Alexis Butler , Paul Brookes , Rafail Giavrimis , Buhong Liu

分类：人工智能

2022-12-20

Machine learning model development and optimisation can be a rather cumbersome and resource-intensive process. Custom models are often more difficult to build and deploy, and they require infrastructure and expertise which are often costly to acquire and maintain. Machine learning product development lifecycle must take into account the need to navigate the difficulties of developing and deploying machine learning models. evoML is an AI-powered tool that provides automated functionalities in machine learning model development, optimisation, and model code optimisation. Core functionalities of evoML include data cleaning, exploratory analysis, feature analysis and generation, model optimisation, model evaluation, model code optimisation, and model deployment. Additionally, a key feature of evoML is that it embeds code and model optimisation into the model development process, and includes multi-objective optimisation capabilities.

translated by 谷歌翻译

Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment

Rohan Pandey , Rulin Shao , Paul Pu Liang , Ruslan Salakhutdinov , Louis-Philippe Morency

分类：自然语言处理 | 计算机视觉 | 机器学习

2022-12-20

Despite recent progress towards scaling up multimodal vision-language models, these models are still known to struggle on compositional generalization benchmarks such as Winoground. We find that a critical component lacking from current vision-language models is relation-level alignment: the ability to match directional semantic relations in text (e.g., "mug in grass") with spatial relationships in the image (e.g., the position of the mug relative to the grass). To tackle this problem, we show that relation alignment can be enforced by encouraging the directed language attention from 'mug' to 'grass' (capturing the semantic relation 'in') to match the directed visual attention from the mug to the grass. Tokens and their corresponding objects are softly identified using the cross-modal attention. We prove that this notion of soft relation alignment is equivalent to enforcing congruence between vision and language attention matrices under a 'change of basis' provided by the cross-modal attention matrix. Intuitively, our approach projects visual attention into the language attention space to calculate its divergence from the actual language attention, and vice versa. We apply our Cross-modal Attention Congruence Regularization (CACR) loss to UNITER and improve on the state-of-the-art approach to Winoground.

translated by 谷歌翻译